A Fast and Specific Alignment Method for Minisatellite Maps
Abstract
BACKGROUND: Variable minisatellites are among the most polymorphic markers of eukaryotic and prokaryotic genomes. This variability can affect gene coding regions, as in the prion protein gene, or gene regulatory regions, as in the cystatin B gene, and can be associated with or implicated in diseases: Creutzfeldt-Jakob disease and myoclonus epilepsy type 1, respectively. When it affects neutrally evolving regions, the length polymorphism of minisatellites (i.e., the variation in their number of copies) has proved useful in population genetics.

MOTIVATION: In these tandem repeat sequences, different mutational mechanisms cause both the number of copies and the copies themselves to vary. In particular, the interspersion of tandem duplication/contraction events with point mutations makes the succession of variant repeats much more informative than allele length alone. Exploiting this information requires the ability to align minisatellite alleles while accounting for both point mutations and tandem duplications.

RESULTS: We propose a minisatellite map alignment program that improves on previous solutions. Our new program is faster and simpler, considers an extended evolutionary model, and is available to the community. We test it on the data set of 609 alleles of the MSY1 (DYF155S1) human minisatellite and confirm its ability to recover known evolutionary signals. Our experiments highlight that the informativeness of minisatellites resides in both their length and their composition polymorphisms. Exploiting both simultaneously is critical to unraveling the implications of variable minisatellites in the control of gene expression and in diseases.
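To make the alignment problem described in the abstract concrete, here is a minimal sketch in Python. It is not the program presented in the paper, and it does not reconstruct duplication histories. It only illustrates the idea of scoring point mutations and tandem duplications/contractions differently. The function name `map_distance`, the one-character-per-variant encoding, and the cost values `sub`, `indel`, and `dup` are all assumptions chosen for illustration. The simplification is that an unaligned variant is charged the cheap `dup` cost whenever it is identical to its left neighbour in its own map, and the full `indel` cost otherwise.

```python
def map_distance(a, b, sub=1.0, indel=2.0, dup=0.5):
    """Simplified alignment cost between two minisatellite maps.

    a, b  : sequences of variant codes (here: strings, one character per repeat).
    sub   : assumed cost of a point mutation between two aligned variants.
    indel : assumed cost of inserting or deleting an arbitrary variant.
    dup   : assumed cost of a tandem duplication/contraction, charged when an
            unaligned variant equals its left neighbour in the same map.
    """
    def gap(seq, k):
        # Leaving seq[k] unaligned is cheap if it repeats the variant before it.
        return dup if k > 0 and seq[k] == seq[k - 1] else indel

    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]

    # First column and first row: one map aligned against an empty prefix.
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap(a, i - 1)
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap(b, j - 1)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            aligned = dp[i - 1][j - 1] + (0.0 if a[i - 1] == b[j - 1] else sub)
            skip_a = dp[i - 1][j] + gap(a, i - 1)
            skip_b = dp[i][j - 1] + gap(b, j - 1)
            dp[i][j] = min(aligned, skip_a, skip_b)
    return dp[n][m]
```

With these assumed costs, `map_distance("aab", "ab")` is 0.5, because the extra `a` is explained as a tandem duplication, whereas `map_distance("acb", "ab")` is 2.0, because the extra `c` has no identical neighbour and must be treated as a plain insertion. Two alleles of different lengths can therefore be closer than the length difference alone suggests, which is the point the abstract makes about combining length and composition.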
Similar resources
Alignment of Minisatellite Maps: A Minimum Spanning Tree-based Approach
In addition to the well-known edit operations, the alignment of minisatellite maps includes duplication events. We model these duplications using a special kind of spanning trees and deduce an optimal duplication scenario by computing the respective minimum spanning tree. Based on best duplication scenarios for all substrings of the given sequences, we compute an optimal alignment of two minisa...
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
Image alignment via kernelized feature learning
Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption for most of the machine learning algorithms is that the training set (source domain) and the test set (target domain) follow from the same probability distribution. However, in most of the real-world application...
Minisatellite linkage maps in the mouse by cross-hybridization with human probes containing tandem repeats.
Tests of 29 human variable number of tandem repeat probes in inbred mouse lines showed that 80% (23/29) cross-hybridize, and 48% (14/29) produce multiple-band minisatellite polymorphisms (fingerprint patterns). Minisatellite-type polymorphisms detected by 11 probes were characterized in eight different strains; on average, 240 polymorphic differences were detected between pairs of strains. Re...
Avoiding Ambiguity and Assessing Uniqueness in Minisatellite Alignment
Several algorithms have been suggested for minisatellite alignment. Their time complexity is high, close to O(n^3), due to the necessary reconstruction of duplication histories. We investigate the uniqueness of optimal alignments computed under the common single-copy duplication model. To this end, it is necessary to avoid ambiguity in the algorithm employed. We re-code the ARLEM algorithm in th...
Journal title:
Volume 2, Issue
Pages -
Publication date: 2006